LoCo: Local Contrastive Representation Learning
Deep neural nets typically perform end-to-end backpropagation to learn the weights, a procedure that creates synchronization constraints in the weight update step across layers and is not biologically plausible. Recent advances in unsupervised contrastive representation learning invite the question of whether a learning algorithm can also be made local, that is, whether the updates of lower layers can be made independent of the computation of upper layers. While Greedy InfoMax separately learns each block with a local objective, we found that it consistently hurts readout accuracy in state-of-the-art unsupervised contrastive learning algorithms, possibly due to the greedy objective as well as gradient isolation. In this work, we discover that by overlapping local blocks stacked on top of each other, we effectively increase the decoder depth and allow upper blocks to implicitly send feedback to lower blocks. This simple design closes the performance gap between local learning and end-to-end contrastive learning algorithms for the first time. Aside from standard ImageNet experiments, we also show results on complex downstream tasks such as object detection and instance segmentation directly using readout features.
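The core structural idea can be sketched in a few lines. The following is a minimal, hypothetical illustration (the function and stage names are ours, not from the paper's code): a backbone is split into stages, and each local block spans two consecutive stages, so adjacent blocks share a stage. Each block is trained with its own local loss, and the shared stage receives gradients from the block above it, which is how upper blocks implicitly send feedback to lower ones.

```python
def make_local_blocks(stages, overlap=True):
    """Group backbone stages into local training blocks.

    With overlap (LoCo-style), block k covers stages (k, k+1), so each
    intermediate stage belongs to two blocks and gets gradient signal
    from the block above it. Without overlap (Greedy InfoMax-style),
    each stage is a single, gradient-isolated block.
    """
    if overlap:
        return [stages[k:k + 2] for k in range(len(stages) - 1)]
    return [[s] for s in stages]

# Illustrative stage names for a four-stage ResNet-style backbone.
stages = ["res1", "res2", "res3", "res4"]
print(make_local_blocks(stages, overlap=True))
# [['res1', 'res2'], ['res2', 'res3'], ['res3', 'res4']]
print(make_local_blocks(stages, overlap=False))
# [['res1'], ['res2'], ['res3'], ['res4']]
```

Note that in the overlapping case each block still trains with its own local objective; the coupling comes only from the shared stage, so no gradient flows across block boundaries the way it does in end-to-end backpropagation.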
Review for NeurIPS paper: LoCo: Local Contrastive Representation Learning
Weaknesses: The paper claims in the Abstract that "by overlapping local blocks" (i.e., the first proposed method), it "closes the performance gap between local learning and end-to-end contrastive learning algorithms for the first time." However, the presented empirical results cannot support this claim. The comparisons with the baseline SimCLR in Table-1 are not fair: according to Table-4, SimCLR achieves an accuracy of 65.7% without extra layers in the decoder and 67.1% with extra layers, yet Table-1 compares SimCLR without extra layers against the proposed solution with extra layers.
Review for NeurIPS paper: LoCo: Local Contrastive Representation Learning
Reviewers were satisfied by the authors' response and clarifications. The discussion phase also contributed to harmonizing their views on the relevance and usefulness of well-performing local criteria. As a result, R1 and R5 increased their scores. The consensus is that the work is a novel and valuable contribution to research on local un/self-supervised learning criteria, with potential relevance for memory savings and biologically plausible alternatives to backpropagation. The AC agrees and recommends acceptance.